Abstract

This paper presents language techniques for applying memoization selectively. The 
techniques provide programmer control over equality, space usage, and identification 
of precise dependences so that memoization can be applied according to the needs 
of an application. Two key properties of the approach are that it accepts an efficient
implementation and yields programs whose performance can be analyzed using standard 
analysis techniques. 

We describe our approach in the context of a functional language called MFL and 
an implementation as a Standard ML library. The MFL language employs a modal 
type system to enable the programmer to express programs that reveal their true data 
dependences when executed. We prove that the MFL language is sound by showing that 
MFL programs yield the same result as they would with respect to a standard,
non-memoizing semantics. The SML implementation cannot support the modal type 
system of MFL statically but instead employs run-time checks to ensure correct usage 
of primitives. 



1 Introduction 



Memoization is a fundamental and powerful technique for result re-use. It dates back a half 
century [Bellman, 1957, McCarthy, 1963, Michie, 1968] and has been used extensively in 
many areas such as dynamic programming [Aho et al., 1974, Cohen, 1983, Cormen et al., 
1990, Liu and Stoller, 1999], incremental computation (e.g., [Demers et al., 1981, Pugh and 
Teitelbaum, 1989, Abadi et al., 1996, Liu et al., 1998, Heydon et al., 2000]), and others [Bird,
1980, Mostow and Cohen, 1985, Hughes, 1985, Norvig, 1991, Liu et al., 1998]. In fact, lazy
evaluation provides a limited form of memoization [Peyton Jones, 1987]. 

Although memoization can dramatically improve performance and can require only small 
changes to the code, no language or library support for memoization has gained broad ac- 
ceptance. Instead, many successful uses of memoization rely on application-specific support 
code. The underlying reason for this is one of control: since memoization is all about perfor- 
mance, the user must be able to control the performance of memoization. Many subtleties 
of memoization, including the cost of equality checking and the cache replacement policy 
for memo tables, can make the difference between exponential and linear running time. 

To be general and widely applicable a memoization framework must provide control 
over these three areas: (1) the kind and cost of equality tests; (2) the identification of 
precise dependences between the input and the output of memoized code; and (3) space 
management. Control over equality tests is critical, because this is how re-usable results are
identified. Identifying precise dependences is important to maximize result reuse. Being able 
to control when memo tables or individual entries are purged is critical, because otherwise 
the user will not know whether or when results are re-used. 

In this paper, we propose techniques for memoization that provide control over equality 
and identification of dependences, and some control over space management. We study 
the techniques in the context of a small language called MFL, which is a purely functional 
language enriched with support for user-controlled, selective memoization. We give several 
examples of the use of the language and we prove its type safety and correctness — i.e., 
that the semantics are preserved with respect to a non-memoized version. The operational 
semantics of MFL specifies the performance of programs accurately enough to determine (ex- 
pected) asymptotic time bounds.^ As an example, we show how to analyze the performance 
of a memoized version of Quicksort. The MFL language accepts an efficient implementa- 
tion with expected constant-time overhead by representing memo tables with nested hash 
tables. We give an implementation of MFL as a library for the Standard ML language. The 
implementation cannot support the modal type system of MFL statically; instead, it relies 
on run-time checks to ensure correct usage of memoization primitives. 

In the next section we describe background and related work. In Section 3 we introduce 
our approach via some examples. In Section 4 we formalize the MFL language and discuss 
its safety, correctness, and performance properties. In Section 5 we present a simple imple- 
mentation of the framework as a Standard ML library. In Section 6 we discuss several ways 
in which the approach may be extended. 

This paper extends the conference version [Acar et al., 2003] with the proofs for the 
correctness of the proposed approach and with a more detailed description of the imple- 
mentation. Although the implementation provided here is in the form of a simple library, 
some of the techniques proposed here have been implemented in the CEAL [Hammer et al.,
2009] and Delta ML [Ley-Wild et al., 2008, Acar and Ley-Wild, 2009] languages, which
provide direct support for self-adjusting computation.

^Expected, rather than worst-case, performance is required because of our reliance on hashing.

2 Background and Related Work 

A typical memoization scheme maintains a memo table mapping argument values to
previously computed results. This table is consulted before each function call to determine if
the particular argument is in the table. If so, the call is skipped and the result is returned; 
otherwise the call is performed and its result is added to the table. The semantics and 
implementation of the memo lookup are critical to performance. Here we review some key 
issues in implementing memoization efficiently. 
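The lookup-then-call scheme just described can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `memoize`, `square`, and the `hits` counter are hypothetical, and the argument is assumed to be hashable so that a dictionary can serve as the memo table.

```python
from typing import Callable, Dict, Hashable, TypeVar

A = TypeVar("A", bound=Hashable)
R = TypeVar("R")


def memoize(fn: Callable[[A], R]) -> Callable[[A], R]:
    """Wrap fn with a memo table keyed by its (hashable) argument."""
    table: Dict[A, R] = {}

    def wrapper(arg: A) -> R:
        if arg in table:           # consult the table before the call
            wrapper.hits += 1
            return table[arg]      # call skipped, stored result returned
        result = fn(arg)           # miss: perform the call
        table[arg] = result        # and add its result to the table
        return result

    wrapper.hits = 0               # illustrative counter, not part of the scheme
    return wrapper


@memoize
def square(n: int) -> int:
    return n * n


first = square(12)    # computed
second = square(12)   # found in the memo table
```

Note that the dictionary lookup hides both a hash computation and an equality test on the argument, which is exactly where the costs discussed next arise.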

2.1 Equality 

Any memoization scheme needs to search a memo table for a match to the current argu- 
ments. Such a search, at minimum, requires a test for equality. Typically it also requires 
some form of hashing. In standard language implementations testing for equality on struc- 
tures, for example, can require traversing the whole structure. The cost of such an equality 
test can negate the advantage of memoizing and may even change the asymptotic behavior 
of the function. A few approaches have been proposed to alleviate this problem. The first 
is based on the fact that for memoization equality need not be exact — it can return unequal 
when two arguments are actually equal. The implementation could therefore decide to skip 
the test if the equality is too expensive, or could use a conservative equality test, such as 
"location" equality. The problem with such approaches is that whether a match is found 
could depend on particulars of the implementation and will surely not be evident to the 
programmer. 

Another approach for reducing the cost of equality tests is to ensure that there is only 
one copy of every value, via a technique known as "hash consing" [Goto and Kanada, 
1976, Allen, 1978, Spitzen and Levitt, 1978]. If there is only one copy, then equality can be 
implemented by comparing locations. In fact, the location can also be used as a key to a hash 
table. In theory, the overhead of hash-consing is constant in the expected case (expectation 
is over internal randomization of hash functions). In practice, hash-consing can be expensive 
because of large memory demands and interaction with garbage collection. In fact, several 
researchers have argued that hash-consing is too expensive for practical purposes [Pugh, 
1988, Appel and Gonçalves, 1993, Murphy et al., 2002]. As an alternative to hash consing,
Pugh proposed lazy structure sharing [Pugh, 1988]. In lazy structure sharing whenever two 
equal values are compared, they are made to point to the same copy to speed up subsequent 
comparisons. As Pugh points out, the disadvantage of this approach is that the performance 
depends on the order of comparisons and can therefore be difficult to analyze.

We note that even with hash-consing, or any other method, it remains critical to define 
equality on all types including reals and functions. Claiming that functions are never 
equivalent, for example, is not satisfactory because the result of a call involving some 
function as a parameter will never be re-used.

2.2 Precise dependences 

To maximize result re-use, the result of a function call must be stored with respect to its true 
dependences. This issue arises when the function examines only parts or an approximation 
of its parameter. To enable "partial" equality checks, the unexamined parts of the parameter 
should be disregarded. To increase result re-use, the programmer should be able to match 
on the approximation, rather than the parameter itself. As an example, consider the code 

fun f(x,y,z) = if (x > 0) then fy(y) else fz(z) 

The result of f depends on either (x,y) or (x,z). Also, it depends on an approximation 
of x (whether or not it is positive) rather than its exact value. For example, the memo 
entry for f(7,11,20) should match the calls f(7,11,30) and f(4,11,50), since when x is
positive, the result depends only on y. 
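To make the matching behaviour concrete, here is a small Python sketch in which the memo key is built from the sign of x and from whichever of y or z is actually used. The helper `make_memo_f`, the `stats` counter, and the bodies chosen for fy and fz are assumptions made only for illustration.

```python
from typing import Callable, Dict, Tuple


def make_memo_f(fy: Callable[[int], int], fz: Callable[[int], int]):
    """Build a memoized f whose key records only the true dependences."""
    table: Dict[Tuple[bool, int], int] = {}
    stats = {"hits": 0}

    def f(x: int, y: int, z: int) -> int:
        positive = x > 0
        # Key: an approximation of x (its sign) plus y or z, never both.
        key = (positive, y if positive else z)
        if key in table:
            stats["hits"] += 1
            return table[key]
        result = fy(y) if positive else fz(z)
        table[key] = result
        return result

    return f, stats


# fy and fz are arbitrary stand-ins for the functions named in the text.
f, stats = make_memo_f(lambda y: y * 2, lambda z: z + 1)
r1 = f(7, 11, 20)   # miss: stored under (True, 11)
r2 = f(7, 11, 30)   # hit: z is never examined
r3 = f(4, 11, 50)   # hit: only the sign of x matters
```

A table keyed on the whole triple (x, y, z) would miss on both of the later calls.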

Several researchers have remarked that partial matching can be very important in some 
applications [Pennings et al., 1992, Pennings, 1994, Abadi et al., 1996, Heydon et al., 2000]. 
Abadi, Lampson, and Levy [Abadi et al., 1996] and Heydon, Levin, and Yu [Heydon et al., 2000]
have suggested program analysis methods for tracking dependences for this purpose. Al- 
though their technique is likely effective in catching potential matches, it does not provide 
a programmer-controlled mechanism for specifying what dependences should be tracked.
Also, their program analysis technique can change the asymptotic complexity of a program,
making it difficult to assess the effects of memoization.

2.3 Space management 

Another problem with memoization is its space requirement. As a program executes, its 
memo tables can become large and limit the utility of memoization. To alleviate this 
problem, memo tables or individual entries should be disposed of under programmer control. 

In some applications, such as in dynamic programming, most result re-use occurs among 
the recursive calls of some function. Thus, the memo table of such a function can be disposed 
of whenever it terminates. This can be achieved by associating a memo table with each
memoized function and reclaiming the table when the function goes out of scope [Cook and 
Launchbury, 1997, Hughes, 1985]. 

In other applications, where result re-use is less structured, individual memo table en- 
tries should be purged according to a replacement policy [Hilden, 1976, Pugh, 1988]. The 
problem is to determine what exact replacement policy should be used and to analyze the 
performance effects of the chosen policy. One widely used approach is to replace the least 
recently used entry. Other, more sophisticated, policies have also been suggested [Pugh, 
1988]. In general the replacement policy must be application-specific, because, for any fixed 
policy, there are programs whose performance is made worse by that choice [Pugh, 1988]. 
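As one concrete instance of a replacement policy, the following Python sketch keeps a bounded memo table and evicts the least recently used entry. The class name `LRUMemo`, its `capacity` parameter, and the `hits` counter are hypothetical; the sketch relies on the standard `collections.OrderedDict`.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LRUMemo:
    """Memo table of bounded size with least-recently-used replacement."""

    def __init__(self, fn: Callable[[Hashable], Any], capacity: int) -> None:
        self.fn = fn
        self.capacity = capacity
        self.table: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0

    def __call__(self, arg: Hashable) -> Any:
        if arg in self.table:
            self.table.move_to_end(arg)      # mark as most recently used
            self.hits += 1
            return self.table[arg]
        result = self.fn(arg)
        self.table[arg] = result
        if len(self.table) > self.capacity:
            self.table.popitem(last=False)   # evict the least recently used
        return result


memo = LRUMemo(lambda n: n * n, capacity=2)
for n in (1, 2, 1, 3, 2):
    memo(n)
```

With capacity 2, the call on 3 evicts the entry for 2, so the final call on 2 misses even though 2 was computed earlier; this is the kind of policy-dependent behaviour that makes performance hard to predict without programmer control.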

2.4 Memoization and dynamic dependence graphs

The techniques presented in this paper were motivated by our previous work on adaptive 
computation [Acar et al., 2002]. In subsequent work [Acar, 2005, Acar et al., 2009, 2007], 
we showed that memoization and adaptive computation techniques are duals in the way 
that they provide for computation re-use. Based on this duality, we showed that they can 
be combined to provide an incremental-computation technique, called self-adjusting com- 
putation, that achieves efficiency for a reasonably broad range of applications. Perhaps one 
of the most interesting aspects of this combination is that it enables re-use of computations 
under mutations to memory [Acar et al., 2006a, 2007], which turns out to be critically 
important for effective re-use of computations. This later work subsequently has led to 
the development of the CEAL [Hammer et al., 2009] and Delta ML [Ley-Wild et al.,
2008, Acar and Ley-Wild, 2009] languages for self-adjusting computation. These languages provide
language constructs that enable the user to memoize expressions as needed and support
creation of locally scoped memo tables. Delta ML additionally supports user-provided equality
tests. 

3 A Framework for Selective Memoization 

We present an overview of our approach via several examples. The examples are written in 
a language that extends a purely functional, ML-like language with selective-memoization
primitives. We formalize the core of this language and study its safety, soundness, and 
performance properties in Section 4. 

3.1 Incremental exploration with resources 

Our approach enables the programmer to determine the precise dependences between the 
input and the result of a function. The main idea is to deem the parameters of a function 
as resources and provide primitives to explore incrementally any value, including the un- 
derlying value of a resource. This incremental exploration process reveals the dependences 
between the parameter of the function and its result. 

The incremental exploration process is guided by types. If a value has the modal type
!τ, then the underlying value of type τ can be bound to an ordinary, unrestricted variable
by the let! construct; this will create a dependence between the underlying value and the
result. If a value has a product type, then its two parts can be bound to two resources using
the let* construct; this creates no dependences. If the value has a sum type, then it can be
case analyzed using the mcase construct, which branches according to the outermost form
of the value and assigns the inner value to a resource; mcase creates a dependence on the
outer form of the value of the resource. The key aspect of let* and mcase is that they
bind resources rather than ordinary variables.

Exploring the input to a function via let!, mcase, and let* builds a branch recording
the dependences between the input and the result of the function. The let! adds to the
Non-memoized

    fib: int -> int
    fun fib (n) =
      if (n < 2) then n
      else fib(n-1) + fib(n-2)

    f: int * int * int -> int
    fun f (x, y, z) =
      if (x > 0) then
        fy y
      else
        fz z

Memoized

    mfib: !int -> int
    mfun mfib (n') =
      let !n = n' in return (
        if (n < 2) then n
        else mfib(!(n-1)) + mfib(!(n-2))) end

    mf: int * !int * !int -> int
    mfun mf (x', y', z') =
      mif (x' > 0) then
        let !y = y' in return (fy y) end
      else
        let !z = z' in return (fz z) end

Figure 1: Fibonacci and expressing partial dependences.



branch the full value, the mcase adds the kind of the sum, and let* adds nothing.
Consequently, a branch contains both data dependences (from let!'s) and control dependences
(from mcase's). When a return is encountered, the branch recording the revealed
dependences is used to key the memo table. If the result is found in the memo table, then the
stored value is returned; otherwise the body of the return is evaluated and the memo table
is updated to map the branch to the result. The type system ensures that all dependences
are made explicit by precluding the use of resources within return's body.

As an example consider the Fibonacci function fib and its memoized counterpart mfib
shown in Figure 1. The memoized version, mfib, exposes the underlying value of its
parameter, a resource, before performing the two recursive calls as usual. Since the result depends
on the full value of the parameter, the parameter has a bang type. The memoized Fibonacci function
runs in linear time as opposed to exponential time when not memoized.

Partial dependences between the input and the result of a function can be captured by 
using the incremental exploration technique. As an example consider the function f shown 
in Figure 1. The function checks whether x is positive or not and returns fy(y) or fz(z). 
Thus the result of the function depends on an approximation of x (its sign) and on either 
y or z. The memoized version mf captures this by first checking if x' is positive or not and 
then exposing the underlying value of y' or z' accordingly. Consequently, the result will 
depend on the sign of x' and on either y' or z'. Thus if mf is called with parameters (1, 5, 7)
first and then (2, 5, 3), the result will be found in the memo the second time, because when
x' is positive the result depends only on y'. Note that the mif construct used in this example
is just a special case of the more general mcase construct.
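The way a branch accumulates dependences and then keys the memo table can be imitated dynamically. The Python sketch below is only an approximation under stated assumptions: `Resource`, `Branch`, `let_bang`, and `mif` are hypothetical names, the bodies of fy and fz are arbitrary, and nothing here enforces the static guarantees of the modal type system.

```python
from typing import Any, Callable, Dict, List, Tuple


class Resource:
    """A parameter whose value should be inspected only through a Branch."""

    def __init__(self, value: Any) -> None:
        self._value = value


class Branch:
    """Records the dependences revealed while exploring resources."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def let_bang(self, res: Resource) -> Any:
        # Data dependence on the full value (analogue of let!).
        self.events.append(("bang", res._value))
        return res._value

    def mif(self, res: Resource, test: Callable[[Any], bool]) -> bool:
        # Control dependence on the outcome of the test only (analogue of mif).
        outcome = test(res._value)
        self.events.append(("mif", outcome))
        return outcome

    def key(self) -> Tuple[Tuple[str, Any], ...]:
        return tuple(self.events)


memo: Dict[Tuple[Tuple[str, Any], ...], int] = {}
stats = {"body_runs": 0}


def fy(y: int) -> int:      # arbitrary stand-in
    return 10 * y


def fz(z: int) -> int:      # arbitrary stand-in
    return -z


def mf(x: int, y: int, z: int) -> int:
    rx, ry, rz = Resource(x), Resource(y), Resource(z)
    branch = Branch()
    if branch.mif(rx, lambda v: v > 0):
        arg, body = branch.let_bang(ry), fy
    else:
        arg, body = branch.let_bang(rz), fz
    key = branch.key()      # the 'return' point: look up the branch
    if key not in memo:
        stats["body_runs"] += 1
        memo[key] = body(arg)
    return memo[key]


r1 = mf(1, 5, 7)    # branch: (mif True, bang 5), body evaluated
r2 = mf(2, 5, 3)    # same branch, so the stored result is reused
runs_after_two = stats["body_runs"]
r3 = mf(-1, 5, 3)   # different branch: (mif False, bang 3)
```

The second call reuses the result even though all of its x and z values differ from the first call, because neither appears in the recorded branch.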

3.2 Memo lookups and indexable types

A critical issue for efficient memoization is the implementation of memo tables along with 
lookup and update operations on them. We support expected constant time memo-table 
lookup and update operations by representing memo tables using hashing. This requires 
that the underlying type τ of a modal type !τ be an indexable type. An indexable type
is associated with an injective function, called an index function, that maps each value of
that type to a unique integer called the index. The uniqueness property of the indices for
a given type ensures that two values are equal if and only if their indices are equal. We
define equality only for indexable types. This enables implementing memo tables as hash 
tables keyed by branches consisting of indices. 

We assume that each primitive type comes with an index function. For example, for 
integers, the identity function can be chosen as the index function. Composite types such 
as lists or functions must be boxed to obtain an indexable type. A boxed value of type τ
has type τ box. When a box is created, it is assigned a unique tag, and this tag is used as
the unique index of that boxed value. For example, we can define boxed lists as follows. 

datatype α blist' = NIL | CONS of α * ((α blist') box)
type α blist = (α blist') box

Based on boxes we implement hash-consing as a form of memoization. For example,
hash-consing for boxed lists can be implemented as follows. 

hCons: !α * !(α blist) -> α blist
mfun hCons (h', t') =
  let !h = h' and !t = t' in
    return (box (CONS(h,t)))
  end

The function takes an item and a boxed list and returns the boxed list formed by consing 
them. Since the function is memoized, if it is ever called with two values that are already 
hash-consed, then the same result will be returned. The advantage of being able to define 
hash-consing as a memoized function is that it can be applied selectively. 
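Boxes and memoized consing can be sketched in Python as follows. This is an illustrative approximation, not the paper's library: the `Box` class, the tag counter, the `hcons` table, and the helper `from_list` are hypothetical, and heads are assumed to be hashable values such as integers.

```python
import itertools
from typing import Any, Dict, List, Tuple

_tags = itertools.count()


class Box:
    """A boxed value; its unique tag plays the role of the index."""

    def __init__(self, contents: Any) -> None:
        self.tag = next(_tags)
        self.contents = contents   # None for NIL, else (head, tail_box)


EMPTY = Box(None)                  # the boxed empty list

_hcons_table: Dict[Tuple[Any, int], Box] = {}


def hcons(head: Any, tail: Box) -> Box:
    """Memoized cons, keyed by the head's value and the tail's tag."""
    key = (head, tail.tag)
    if key not in _hcons_table:
        _hcons_table[key] = Box((head, tail))
    return _hcons_table[key]


def from_list(items: List[Any]) -> Box:
    """Build a hash-consed boxed list from a Python list."""
    result = EMPTY
    for item in reversed(items):
        result = hcons(item, result)
    return result


a = from_list([1, 2, 3])
b = from_list([1, 2, 3])           # same box as a: every cell is reused
c = Box((1, a.contents[1]))        # built without hcons: a fresh box
```

Because `hcons` is an ordinary function, a program can choose cell by cell whether to pay for sharing (as for `a` and `b`) or to allocate directly (as for `c`), which is the selectivity the text refers to.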

3.3 Controlling space usage via scoping 

To control space usage of memo tables, we enable the programmer to dispose of memo tables
by conventional scoping, associating each memoized function with its own memo table.
When a memoized function goes out of scope, its memo table can be garbage collected. For 
example, in many dynamic-programming algorithms result re-use occurs between recursive 
calls of the same function. In this case, the programmer can scope the memoized function 
inside an auxiliary function so that its memo table is discarded as soon as the auxiliary 
function returns. As an example, consider the standard algorithm for the Knapsack Problem 
ks and its memoized version mks shown in Figure 2. Since result sharing mostly occurs among the
recursive calls of mks, it can be scoped in some other function that calls mks; once mks 
Non-memoized

    ks: int * ((int*real) list) -> int
    fun ks (c,l) =
      case l of
        nil => 0
      | (w,v)::t =>
          if (c < w) then
            ks(c,t)
          else
            let v1 = ks(c,t)
                v2 = v + ks(c-w,t)
            in
              if (v1 > v2) then v1
              else v2
            end

Memoized

    mks: !int * !((int*real) list) -> int
    mfun mks (c',l') =
      let !c = c' and !l = l' in return (
        case (unbox l) of
          NIL => 0
        | CONS((w,v),t) =>
            if (c < w) then
              mks(!c,!t)
            else
              let v1 = mks(!c,!t)
                  v2 = v + mks(!(c-w),!t)
              in
                if (v1 > v2) then v1
                else v2
              end) end

Figure 2: Memo tables for memoized Knapsack can be discarded at completion.



returns, its memo table will go out of scope and can be discarded.
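The scoping idea carries over directly to any language with closures. In the Python sketch below, the memo table is a local variable of the hypothetical auxiliary function `knapsack`, so it becomes garbage as soon as that function returns. As a simplifying assumption, the remaining items are identified by an index into the list rather than by a boxed list tail.

```python
from typing import Dict, List, Tuple


def knapsack(capacity: int, items: List[Tuple[int, float]]) -> float:
    """Best total value of (weight, value) items that fit in capacity."""
    table: Dict[Tuple[int, int], float] = {}   # lives only for this call

    def mks(c: int, i: int) -> float:
        # i stands for the suffix items[i:], i.e. the tail of the list.
        if (c, i) in table:
            return table[(c, i)]
        if i == len(items):
            result = 0.0
        else:
            w, v = items[i]
            if c < w:
                result = mks(c, i + 1)
            else:
                result = max(mks(c, i + 1), v + mks(c - w, i + 1))
        table[(c, i)] = result
        return result

    return mks(capacity, 0)   # after this returns, table is unreachable


best = knapsack(10, [(5, 10.0), (4, 40.0), (6, 30.0), (3, 50.0)])
```

Each top-level call starts with an empty table, so results are shared among the recursive calls of one run but not across runs.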

We note that this technique gives only partial control over space usage. In particular it 
does not give control over when individual memo table entries are purged. In Section 6, we 
discuss how the framework might be extended so that each memo table is managed according 
to a programmer-specified caching scheme. The main idea is to require the programmer to
supply a caching scheme as a parameter to the mfun and maintain the memo table according
to the chosen caching scheme. 

3.4 Memoized Quicksort 

As a more sophisticated example, we consider Quicksort. Figure 3 shows an implementation 
of the Quicksort algorithm and its memoized counterpart. The algorithm first divides its
input into two lists, containing the keys less than the pivot and the keys greater than the pivot, by
using the filter function fil. It then sorts the two sublists and returns the concatenation
of the results. The memoized filter function mfil uses hash-consing to ensure that there
is only one copy of each result list. The memoized Quicksort algorithm mqs exposes the
underlying value of its parameter and is otherwise similar to qs. Note that mqs does not
build its result via hash-consing; it can output two copies of the same result. Since in this
example the output of mqs is not consumed by any other function, there is no need to do 
so. Even if the result were consumed by some other function, one can choose not to use 
hash-consing because operations such as insertions to and deletions from the input list will 
surely change the result of Quicksort. 

When the memoized Quicksort algorithm is called on "similar" inputs, one would expect 
that some of the results would be re-used. Indeed, we show that the memoized Quicksort 
Non-memoized

    fil: int->bool * int list -> int list
    fun fil (g: int->bool, l: int list) =
      case l of
        nil => nil
      | h::t =>
          let tt = fil(g,t) in
            if (g h) then h::tt
            else tt
          end

    qs: int list -> int list
    fun qs (l) =
      case l of
        nil => nil
      | cons(h,t) =>
          let s = fil(fn x=>x<h,t)
              g = fil(fn x=>x>=h,t)
          in
            (qs s)@(h::(qs g))
          end

Memoized

    empty = box NIL

    mfil: int->bool * int blist -> int blist
    fun mfil (g,l) =
      case (unbox l) of
        NIL => empty
      | CONS(h,t) =>
          let tt = mfil(g,t) in
            if (g h) then hCons(h,tt)
            else tt
          end

    mqs: !(int blist) -> int blist
    mfun mqs (l': int blist) =
      let !l = l' in return (
        case (unbox l) of
          NIL => NIL
        | CONS(h,t) =>
            let s = mfil(fn x=>x<h,t)
                g = mfil(fn x=>x>=h,t)
            in
              (mqs !s)@(h::(mqs !g))
            end) end

Figure 3: The Quicksort algorithm.



algorithm computes its result in expected linear time when its input is obtained from a
previous input by inserting a new key at the beginning. Here the expectation is over all
permutations of the input list and also the internal randomization of the hash functions used 
to implement the memo tables. For the analysis, we assume, without loss of generality, that 
all keys in the list are unique. 
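The re-use between two runs on similar inputs can be observed with a small Python sketch. It is an approximation under explicit assumptions: `Box`, `hcons`, `from_list`, and the `stats` counter are hypothetical helpers, the input list is itself built with `hcons`, and the sorted result is returned as a plain Python list rather than a boxed list.

```python
import itertools
from typing import Callable, Dict, List, Tuple

_tags = itertools.count()


class Box:
    def __init__(self, contents) -> None:
        self.tag = next(_tags)
        self.contents = contents   # None for NIL, else (head, tail_box)


EMPTY = Box(None)
_hcons_table: Dict[Tuple[int, int], Box] = {}


def hcons(h: int, t: Box) -> Box:
    """Hash-consed cons: equal (head, tail) pairs share one box."""
    key = (h, t.tag)
    if key not in _hcons_table:
        _hcons_table[key] = Box((h, t))
    return _hcons_table[key]


def mfil(g: Callable[[int], bool], l: Box) -> Box:
    """Filter that builds its result with hcons, as mfil does in Figure 3."""
    if l.contents is None:
        return EMPTY
    h, t = l.contents
    tt = mfil(g, t)
    return hcons(h, tt) if g(h) else tt


_mqs_table: Dict[int, List[int]] = {}
stats = {"misses": 0}


def mqs(l: Box) -> List[int]:
    """Quicksort memoized on the tag of its boxed input list."""
    if l.tag in _mqs_table:
        return _mqs_table[l.tag]
    stats["misses"] += 1
    if l.contents is None:
        result: List[int] = []
    else:
        h, t = l.contents
        s = mfil(lambda x: x < h, t)
        g = mfil(lambda x: x >= h, t)
        result = mqs(s) + [h] + mqs(g)
    _mqs_table[l.tag] = result
    return result


def from_list(items: List[int]) -> Box:
    result = EMPTY
    for item in reversed(items):
        result = hcons(item, result)
    return result


L = from_list([5, 2, 8, 1, 9, 3])
sorted_L = mqs(L)
first_run = stats["misses"]

L_prime = hcons(6, L)              # insert a new key at the beginning
sorted_L_prime = mqs(L_prime)
second_run = stats["misses"] - first_run
```

On the second run most recursive calls receive boxes already seen in the first run, so `second_run` is smaller than `first_run`; the theorem below quantifies this effect in expectation.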

Theorem 1 

Let L be a list and let L' = [a, L]. Consider running memoized Quicksort on L and then on
L'. The running time of Quicksort on the modified list L' is expected O(n) where n is the
length of L'.

Proof: Consider the recursion tree of Quicksort with input L, denoted Q(L), and label
each node with the pivot of the corresponding recursive call (see Figure 4 for an example).
Consider any pivot (key) p from L and let L_p denote the keys that precede p in L. It is easy
to see that a key k is in the subtree rooted at p if and only if the following two properties
are satisfied for any key k' ∈ L_p.

1. if k' < p then k > k', and
2. if k' > p then k < k'.

Of the keys that are in the subtree of p, those that are less than p are in its left subtree and 
those greater than p are in its right subtree. 

Now consider the recursion tree Q(L') for L' = [a, L] and let p be any pivot in Q(L').
Suppose p < a and let k be any key in the left subtree of p in Q(L). Since k < p, by
the two properties k is in the left subtree of p in Q(L'). Similarly if p > a then any k
in the right subtree of p in Q(L) is also in the right subtree of p in Q(L'). Since filtering
preserves the respective order of keys in the input list, for any p, p < a, the input to the 
recursive call corresponding to its left child will be the same. Similarly, for p > a, the 
input to the recursive call corresponding to its right child will be the same. Thus, when 
sorting L' these recursive calls will find their results in the memo. Therefore only recursive 
calls corresponding to the root, to the children of the nodes in the rightmost spine of the 
left subtree of the root, and the children of the nodes in the leftmost spine of the right 
subtree of the root may be executed (the two spines are shown with thick lines in Figure 4).
Furthermore, the results for the calls adjacent to the spines will be found in the memo. 

Consider the calls whose results are not found in the memo. In the worst case, these will
be all the calls along the two spines. Consider the sizes of inputs for the nodes on a spine
and define the random variables $X_1, \ldots, X_k$ such that $X_i$ is the least number of recursive
calls (nodes) performed for the input size to become $(3/4)^i n$ or less after it first becomes
$(3/4)^{i-1} n$ or less. Since $k \le \lceil \log_{4/3} n \rceil$, the total and the expected number of operations
along a spine are

\[
C(n) \le \sum_{i=1}^{\lceil \log_{4/3} n \rceil} X_i \left(\frac{3}{4}\right)^{i-1} n,
\quad \text{and} \quad
E[C(n)] \le \sum_{i=1}^{\lceil \log_{4/3} n \rceil} E[X_i] \left(\frac{3}{4}\right)^{i-1} n.
\]

Since the probability that the pivot lies in the middle half of the list is $1/2$, $E[X_i] \le 2$ for
$i \ge 1$, and we have

\[
E[C(n)] \le \sum_{i=1}^{\lceil \log_{4/3} n \rceil} 2 \left(\frac{3}{4}\right)^{i-1} n.
\]

Thus, E[C(n)] = O(n). This bound holds for both spines; therefore the number of operations
due to calls whose results are not found in the memo is O(n). Since each operation,
including hash-consing, takes expected constant time, the total time of the calls whose results are
not in the memo is O(n). Now consider the calls whose results are found in the memo.
Each such call is on a spine or adjacent to it; thus there are an expected O(log n) such
calls. Since the memo-table lookup overhead is expected constant time, the total cost for
these is O(log n). We conclude that Quicksort will take expected O(n) time for sorting the
modified list L'. ■
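The step from the final sum in the proof to the linear bound is a geometric series. Spelled out, with the explicit constant 8 being our own arithmetic rather than a figure stated in the text:

```latex
\[
E[C(n)]
\;\le\; \sum_{i=1}^{\lceil \log_{4/3} n \rceil} 2 \left(\frac{3}{4}\right)^{i-1} n
\;\le\; 2n \sum_{i=0}^{\infty} \left(\frac{3}{4}\right)^{i}
\;=\; 2n \cdot \frac{1}{1 - 3/4}
\;=\; 8n
\;=\; O(n).
\]
```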

This theorem can be extended to show that the O(n) bound holds for an insertion
anywhere in the list. Although this bound is better than a complete rerun, which would
take expected O(n log n), it is still far from optimal for Quicksort (expected O(log n)).
It is not known if the optimal bound can be achieved by using memoization only. The 
optimal, however, can be achieved by using a combination of dynamic dependence graphs 
and memoization [Acar et al., 2006b, Acar, 2005]. 

4 The MFL Language 

In this section we study a small functional language, called MFL, that supports selective 
memoization. MFL distinguishes memoized from non-memoized code, and is equipped with 
a modality for tracking dependences on data structures within memoized code. This modal- 
ity is central to our approach to selective memoization, and is the focus of our attention 
here. The main result is a soundness theorem stating that memoization does not affect the 
outcome of a computation compared to a standard, non-memoizing semantics (Section 4.4). 
We also show that the memoization mechanism of MFL causes a constant factor slowdown 
compared to a standard, non-memoizing semantics (Section 4.5). 
Indexable Types  η ::= 1 | int | ...
Types            τ ::= η | !η | τ1 × τ2 | τ1 + τ2 | μu.τ | τ1 → τ2
Operators        o ::= + | - | ...
Expressions      e ::= return(t) | let !x:η be t in e end |
                       let a1:τ1 × a2:τ2 be t in e end |
                       mcase t of inl (a1:τ1) => e1 | inr (a2:τ2) => e2 end
Terms            t ::= v | o(t1, ..., tn) | ⟨t1, t2⟩ | mfun f (a:τ1):τ2 is e end |
                       t1 t2 | !t | inl_{τ1+τ2} t | inr_{τ1+τ2} t | roll(t) | unroll(t)
Values           v ::= x | a | ★ | n | !v | ⟨v1, v2⟩ | mfun_l f (a:τ1):τ2 is e end

Figure 5: The abstract syntax of MFL.



4.1 Abstract syntax

The abstract syntax of MFL is given in Figure 5. The meta-variables x and y range over
a countable set of variables. The meta-variables a and b range over a countable set of
resources. (The distinction will be made clear below.) The meta-variable l ranges over a
countable set of locations. We assume that variables, resources, and locations are mutually 
disjoint. The binding and scope conventions for variables and resources are as would be 
expected from the syntactic forms. As usual we identify pieces of syntax that differ only in 
their choice of bound variable or resource names. A term or expression is resource-free if 
and only if it contains no free resources, and is variable-free if and only if it contains no free 
variables. A closed term or expression is both resource-free and variable-free; otherwise it 
is open. 

The types of MFL include 1 (unit), int, products and sums, recursive data types μu.τ,
memoized function types, and bang types !η. MFL distinguishes indexable types, denoted
η, as those that accept an injective function, called an index function, whose co-domain
is integers. The underlying type of a bang type !η is restricted to be an indexable type.
For type int, identity serves as an index function; for 1 (unit) any constant function can
be chosen as the index function. For non-primitive types an index can be supplied by
boxing values of these types. Boxed values would be allocated in a store and the unique
location of a box would serve as an index for the underlying value. With this extension the
indexable types would be defined as η ::= 1 | int | τ box. Since supporting boxed types
is well understood, we do not formalize boxing here.

The abstract syntax is structured into terms and expressions, in the terminology of 
Pfenning and Davies [Pfenning and Davies, 2001]. Roughly speaking, terms evaluate in- 
dependently of their context, as in ordinary functional programming, whereas expressions 
evaluate in the context of a memo table. Thus, the body of a memoized function is an 
expression, whereas the function itself is a term. Note, however, that the application of a
function is a term, not an expression; this corresponds to the encapsulation of memoization 
with the function, so that updating the memo table is benign. In a more complete language 
we would include case analysis and projection forms among the terms, but for the sake of 
simplicity we include these only as expressions. We would also include a plain function
for which the body is a term. Note that every term is trivially an expression; the return 
expression is the inclusion. 

4.2 Static semantics 

The type structure of MFL extends the framework of Pfenning and Davies [Pfenning and
Davies, 2001] with a "necessitation" modality, !η, which is used to track data dependences
for selective memoization. This modality does not correspond to a monadic interpretation
of memoization effects (○ in the notation of Pfenning and Davies), though one could
imagine adding such a modality to the language. The introductory and eliminatory forms
for necessity are standard, namely !t for introduction, and let !x:η be t in e end for
elimination.

Our modality demands that we distinguish variables from resources. Variables in MFL 
correspond to the "validity" , or "unrestricted" , context in modal logic, whereas resources in 
MFL correspond to the "truth", or "restricted" context. An analogy may also be made to 
the judgmental presentation of linear logic [Pfenning, 1995, Polakow and Pfenning, 1999]: 
variables correspond to the intuitionistic context, resources to the linear context.^ 

The inclusion, return(t), of terms into expressions has no analogue in pure modal 
logic, but is specific to our interpretation of memoization as a computational effect. The 
typing rule for return(t) requires that t be resource-free to ensure that any dependence 
on the argument to a memoized function is made explicit in the code before computing 
the return value of the function. In the first instance, resources arise as parameters to 
memoized functions, with further resources introduced by their incremental decomposition 
using let× and mcase. These additional resources track the usage of as-yet-unexplored 
parts of a data structure. Ultimately, the complete value of a resource may be accessed 
using the let! construct, which binds its value to a variable that may be used without 
restriction. In practice this means that those parts of an argument to a memoized function 
on whose value the function depends will be given modal type. However, it is not essential 
that all resources have modal type, nor that the computation depend upon every resource 
that does have modal type. 

The static semantics of MFL consists of a set of rules for deriving typing judgments of 
the form $\Gamma; \Delta \vdash t : \tau$, for terms, and $\Gamma; \Delta \vdash e : \tau$, for expressions. In these judgments $\Gamma$ is a 
variable type assignment, a finite function assigning types to variables, and $\Delta$ is a resource 
type assignment, a finite function assigning types to resources. Figure 6 shows the typing 
judgments for terms and expressions. 

^Note, however, that we impose no linearity constraints in our type system! 
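Terms:

\[
\frac{\Gamma(x) = \tau}{\Gamma; \Delta \vdash x : \tau}\ \text{(variable)}
\qquad
\frac{\Delta(a) = \tau}{\Gamma; \Delta \vdash a : \tau}\ \text{(resource)}
\]

\[
\frac{}{\Gamma; \Delta \vdash n : \texttt{int}}\ \text{(number)}
\qquad
\frac{}{\Gamma; \Delta \vdash \star : 1}\ \text{(unit)}
\]

\[
\frac{\Gamma; \Delta \vdash t_i : \tau_i \ \ (1 \le i \le n) \qquad \vdash_o o : (\tau_1, \ldots, \tau_n)\, \tau}
     {\Gamma; \Delta \vdash o(t_1, \ldots, t_n) : \tau}\ \text{(primitive)}
\]

\[
\frac{\Gamma; \Delta \vdash t_1 : \tau_1 \qquad \Gamma; \Delta \vdash t_2 : \tau_2}
     {\Gamma; \Delta \vdash \langle t_1, t_2 \rangle : \tau_1 \times \tau_2}\ \text{(pair)}
\]

\[
\frac{\Gamma, f{:}\tau_1 \to \tau_2;\ \Delta, a{:}\tau_1 \vdash e : \tau_2}
     {\Gamma; \Delta \vdash \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end} : \tau_1 \to \tau_2}\ \text{(fun)}
\]

\[
\frac{\Gamma, f{:}\tau_1 \to \tau_2;\ \Delta, a{:}\tau_1 \vdash e : \tau_2}
     {\Gamma; \Delta \vdash \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end} : \tau_1 \to \tau_2}\ \text{(fun value)}
\]

\[
\frac{\Gamma; \Delta \vdash t_1 : \tau_1 \to \tau_2 \qquad \Gamma; \Delta \vdash t_2 : \tau_1}
     {\Gamma; \Delta \vdash t_1\, t_2 : \tau_2}\ \text{(apply)}
\qquad
\frac{\Gamma; \emptyset \vdash t : \eta}
     {\Gamma; \Delta \vdash\ !t :\ !\eta}\ \text{(bang)}
\]

\[
\frac{\Gamma; \Delta \vdash t : \tau_1}
     {\Gamma; \Delta \vdash \texttt{inl}_{\tau_1 + \tau_2}\, t : \tau_1 + \tau_2}\ \text{(sum/inl)}
\qquad
\frac{\Gamma; \Delta \vdash t : \tau_2}
     {\Gamma; \Delta \vdash \texttt{inr}_{\tau_1 + \tau_2}\, t : \tau_1 + \tau_2}\ \text{(sum/inr)}
\]

\[
\frac{\Gamma; \Delta \vdash t : [\mu u.\tau / u]\tau}
     {\Gamma; \Delta \vdash \texttt{roll}(t) : \mu u.\tau}\ \text{(roll)}
\qquad
\frac{\Gamma; \Delta \vdash t : \mu u.\tau}
     {\Gamma; \Delta \vdash \texttt{unroll}(t) : [\mu u.\tau / u]\tau}\ \text{(unroll)}
\]

Expressions:

\[
\frac{\Gamma; \emptyset \vdash t : \tau}
     {\Gamma; \Delta \vdash \texttt{return}(t) : \tau}\ \text{(return)}
\]

\[
\frac{\Gamma; \Delta \vdash t :\ !\eta \qquad \Gamma, x{:}\eta;\ \Delta \vdash e : \tau}
     {\Gamma; \Delta \vdash \texttt{let}\ !x{:}\eta\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} : \tau}\ \text{(let!)}
\]

\[
\frac{\Gamma; \Delta \vdash t : \tau_1 \times \tau_2 \qquad \Gamma;\ \Delta, a_1{:}\tau_1, a_2{:}\tau_2 \vdash e : \tau}
     {\Gamma; \Delta \vdash \texttt{let}\ a_1{:}\tau_1 \times a_2{:}\tau_2\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} : \tau}\ \text{(let×)}
\]

\[
\frac{\Gamma; \Delta \vdash t : \tau_1 + \tau_2 \qquad \Gamma;\ \Delta, a_1{:}\tau_1 \vdash e_1 : \tau \qquad \Gamma;\ \Delta, a_2{:}\tau_2 \vdash e_2 : \tau}
     {\Gamma; \Delta \vdash \texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} : \tau}\ \text{(case)}
\]
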

Figure 6: Typing judgments for terms (top) and expressions (bottom). 
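\[
\frac{}{\sigma, \star \Downarrow^{t} \star, \sigma}\ \text{(unit)}
\qquad
\frac{}{\sigma, n \Downarrow^{t} n, \sigma}\ \text{(number)}
\]

\[
\frac{\sigma, t_1 \Downarrow^{t} v_1, \sigma_1 \quad \cdots \quad \sigma_{n-1}, t_n \Downarrow^{t} v_n, \sigma_n}
     {\sigma, o(t_1, \ldots, t_n) \Downarrow^{t} \mathrm{app}(o, (v_1, \ldots, v_n)), \sigma_n}\ \text{(primitive)}
\]

\[
\frac{\sigma, t_1 \Downarrow^{t} v_1, \sigma' \qquad \sigma', t_2 \Downarrow^{t} v_2, \sigma''}
     {\sigma, \langle t_1, t_2 \rangle \Downarrow^{t} \langle v_1, v_2 \rangle, \sigma''}\ \text{(pair)}
\]

\[
\frac{(l \notin \mathrm{dom}(\sigma))}
     {\sigma, \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end} \Downarrow^{t} \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end},\ \sigma[l \mapsto \emptyset]}\ \text{(fun)}
\]

\[
\frac{(l \in \mathrm{dom}(\sigma))}
     {\sigma, \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end} \Downarrow^{t} \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end},\ \sigma}\ \text{(fun val)}
\]

\[
\frac{\begin{array}{c}
\sigma, t_1 \Downarrow^{t} v_1, \sigma_1 \qquad \sigma_1, t_2 \Downarrow^{t} v_2, \sigma_2 \\
\sigma_2, l{:}\bullet, [v_1, v_2 / f, a]e \Downarrow^{e} v, \sigma' \\
(v_1 = \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end})
\end{array}}
     {\sigma, t_1\, t_2 \Downarrow^{t} v, \sigma'}\ \text{(apply)}
\]

\[
\frac{\sigma, t \Downarrow^{t} v, \sigma'}
     {\sigma,\ !t \Downarrow^{t}\ !v, \sigma'}\ \text{(bang)}
\]

\[
\frac{\sigma, t \Downarrow^{t} v, \sigma'}
     {\sigma, \texttt{inl}_{\tau_1 + \tau_2}\, t \Downarrow^{t} \texttt{inl}_{\tau_1 + \tau_2}\, v, \sigma'}\ \text{(case/inl)}
\qquad
\frac{\sigma, t \Downarrow^{t} v, \sigma'}
     {\sigma, \texttt{inr}_{\tau_1 + \tau_2}\, t \Downarrow^{t} \texttt{inr}_{\tau_1 + \tau_2}\, v, \sigma'}\ \text{(case/inr)}
\]

\[
\frac{\sigma, t \Downarrow^{t} v, \sigma'}
     {\sigma, \texttt{roll}(t) \Downarrow^{t} \texttt{roll}(v), \sigma'}\ \text{(roll)}
\qquad
\frac{\sigma, t \Downarrow^{t} \texttt{roll}(v), \sigma'}
     {\sigma, \texttt{unroll}(t) \Downarrow^{t} v, \sigma'}\ \text{(unroll)}
\]
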

Figure 7: Evaluation of terms. 



4.3 Dynamic semantics 

The dynamic semantics of MFL formalizes selective memoization. Evaluation is parameter- 
ized by a store containing memo tables that track the behavior of functions in the program. 
Evaluation of a function expression allocates an empty memo table and associates it with 
the function. Application of a memoized function is affected by, and may affect, its memo 
table. When the function value becomes inaccessible, so does its associated memo table, and 
the storage required for both can be reclaimed. 

Unlike conventional memoization, however, the memo table is keyed by control flow 
information rather than by the values of arguments to memoized functions. This is the key 
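\[
\frac{\sigma(l)(\beta) = v}
     {\sigma, l{:}\beta, \texttt{return}(t) \Downarrow^{e} v, \sigma}\ \text{(return/found)}
\]

\[
\frac{\begin{array}{c}
\sigma(l) = \theta \qquad \theta(\beta)\uparrow \\
\sigma, t \Downarrow^{t} v, \sigma' \\
\sigma'(l) = \theta'
\end{array}}
     {\sigma, l{:}\beta, \texttt{return}(t) \Downarrow^{e} v,\ \sigma'[l \leftarrow \theta'[\beta \mapsto v]]}\ \text{(return/not found)}
\]

\[
\frac{\sigma, t \Downarrow^{t}\ !v, \sigma' \qquad \sigma', l{:}\,!v \cdot \beta, [v/x]e \Downarrow^{e} v', \sigma''}
     {\sigma, l{:}\beta, \texttt{let}\ !x{:}\eta\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} \Downarrow^{e} v', \sigma''}\ \text{(let!)}
\]

\[
\frac{\sigma, t \Downarrow^{t} v_1 \times v_2, \sigma' \qquad \sigma', l{:}\beta, [v_1/a_1, v_2/a_2]e \Downarrow^{e} v, \sigma''}
     {\sigma, l{:}\beta, \texttt{let}\ a_1 \times a_2\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} \Downarrow^{e} v, \sigma''}\ \text{(let×)}
\]

\[
\frac{\sigma, t \Downarrow^{t} \texttt{inl}_{\tau_1 + \tau_2}\, v, \sigma' \qquad \sigma', l{:}\texttt{inl} \cdot \beta, [v/a_1]e_1 \Downarrow^{e} v_1, \sigma''}
     {\sigma, l{:}\beta, \texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} \Downarrow^{e} v_1, \sigma''}\ \text{(case/inl)}
\]

\[
\frac{\sigma, t \Downarrow^{t} \texttt{inr}_{\tau_1 + \tau_2}\, v, \sigma' \qquad \sigma', l{:}\texttt{inr} \cdot \beta, [v/a_2]e_2 \Downarrow^{e} v_2, \sigma''}
     {\sigma, l{:}\beta, \texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} \Downarrow^{e} v_2, \sigma''}\ \text{(case/inr)}
\]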

Figure 8: Evaluation of expressions. 



to supporting selective memoization. Expression evaluation is essentially an exploration of 
the available resources culminating in a resource-free term that determines its value. Since 
the exploration is data-sensitive, only certain aspects of the resources may be relevant to 
a particular outcome. For example, a memoized function may take a pair of integers as 
argument, with the outcome determined independently of the second component in the 
case that the first is positive. By recording control-flow information during evaluation, we 
may use it to provide selective memoization. 

For example, in the situation just described, all pairs of the form (0,v) should map 
to the same result value, irrespective of the value v. In conventional memoization the 
memo table would be keyed by the pair, with the result that redundant computation is 
performed in the case that the function has not previously been called with v, even though 
the value of v is irrelevant to the result! In our framework we instead key the memo table 
by a "branch" that records sufficient control flow information to capture the general case. 
Whenever we encounter a return statement, we query the memo table with the current 
branch to determine whether this result has been computed before. If so, we return the 
stored value; if not, we evaluate the return statement, and associate that value with that 
branch in the memo table for future use. It is crucial that the returned term not contain any 
resources so that we are assured that its value does not change across calls to the function. 
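The difference between keying on arguments and keying on a branch can be seen in a small Python sketch. It is only an illustration under our own assumptions: the function `f`, the event encoding, and the `stats` counter are hypothetical and do not come from MFL. Here `f(x, y)` ignores `y` whenever `x` is positive, and the memo table is keyed by the events the computation actually performs.

```python
def make_memo_f():
    """Memoize f(x, y), where f ignores y whenever x > 0."""
    table = {}              # memo table: branch (tuple of events) -> result
    stats = {"misses": 0}   # counts how often the return body is evaluated

    def f(x, y):
        # Record only the choices the computation actually makes.
        branch = [("!", x)]               # f always depends on all of x
        if x > 0:
            branch.append("inl")          # positive case: y is never inspected
            compute = lambda: x * x
        else:
            branch.append("inr")
            branch.append(("!", y))       # otherwise the result depends on y too
            compute = lambda: x + y
        key = tuple(branch)
        if key not in table:              # memo miss: evaluate the body and store it
            stats["misses"] += 1
            table[key] = compute()
        return table[key]

    return f, stats


f, stats = make_memo_f()
f(3, 10)
f(3, 99)     # same branch as the previous call, so this is a memo hit
f(-1, 10)
f(-1, 99)    # y was inspected on this path, so this is a different branch
```

A table keyed by the pair `(x, y)` would miss on all four calls; the branch-keyed table misses on three, because the two calls with `x = 3` share a branch.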

The dynamic semantics of MFL is given by a set of rules for deriving judgments of the 
form $\sigma, t \Downarrow^{t} v, \sigma'$ (for terms) and $\sigma, l{:}\beta, e \Downarrow^{e} v, \sigma'$ (for expressions). The rules for deriving 
these judgments are given in Figures 7 and 8. These rules make use of branches, memo 
tables, and stores, whose precise definitions are as follows. 

A simple branch is a list of simple events corresponding to "choice points" in the 
evaluation of an expression. 

\[
\begin{array}{llcl}
\text{Simple Event}  & \varepsilon & ::= & !v \mid \texttt{inl} \mid \texttt{inr} \\
\text{Simple Branch} & \beta       & ::= & \bullet \mid \varepsilon \cdot \beta
\end{array}
\]

We write $\beta \mathbin{\hat{}} \varepsilon$ to stand for the extension of $\beta$ with the event $\varepsilon$ at the end. 

A memo table, $\theta$, is a finite function mapping simple branches to values. We write 
$\theta[\beta \mapsto v]$, where $\beta \notin \mathrm{dom}(\theta)$, to stand for the extension of $\theta$ with the given binding for $\beta$. 
We write $\theta(\beta)\uparrow$ to mean that $\beta \notin \mathrm{dom}(\theta)$. 

A store, $\sigma$, is a finite function mapping locations, $l$, to memo tables. We write $\sigma[l \mapsto \theta]$, 
where $l \notin \mathrm{dom}(\sigma)$, to stand for the extension of $\sigma$ with the given binding for $l$. When 
$l \in \mathrm{dom}(\sigma)$, we write $\sigma[l \leftarrow \theta]$ for the store that maps $l$ to $\theta$ and $l' \neq l$ to $\sigma(l')$. 

Term evaluation is largely standard, except for the evaluation of (memoizing) functions 
and applications of these to arguments. Evaluation of a memoizing function term allocates a 
fresh memo table, which is then associated with the function's value. Expression evaluation 
is initiated by an application of a memoizing function to an argument. The function value 
determines the memo table to be used for that call. Evaluation of the body is performed 
relative to that table, initiating with the null branch. 

Expression evaluation is performed relative to a "current" memo table and branch. 
When a return statement is encountered, the current memo table is consulted to determine 
whether or not that branch has previously been taken. If so, the stored value is returned; 
otherwise, the argument term is evaluated, stored in the current memo table at that branch, 
and the value is returned. The let! and mcase expressions extend the current branch to 
reflect control flow. Since let! signals dependence on a complete value, that value is 
added to the branch. Case analysis, however, merely extends the branch with an indication 
of which case was taken. The let× construct does not extend the branch, because no 
additional information is gleaned by splitting a pair. 

4.4 Soundness of MFL 

We prove the soundness of MFL relative to a non-memoizing semantics for the language. It 
is straightforward to give a purely functional semantics to MFL by an inductive definition 
of the relations $t \Downarrow^{t}_{p} v$ and $e \Downarrow^{e}_{p} v$, where $v$ is a pure value with no location subscripts (see, 
for example, [Pfenning and Davies, 2001]). We show that memoization does not affect the 
outcome of evaluation as compared to the non-memoized semantics (Theorem 5). To make 
this precise, we must introduce some additional machinery. 

The underlying term, $t^-$, of a term, $t$, is obtained by erasing all location subscripts on 
function values occurring within $t$. The underlying expression, $e^-$, of an expression, $e$, is 
defined in the same way. As a special case, the underlying value, $v^-$, of a value, $v$, is the 
underlying term of $v$ regarded as a term. It is easy to check that every pure value arises 
as the underlying value of some impure value. Note that passage to the underlying term or 
expression obviously commutes with substitution. The underlying branch, $\beta^-$, of a simple 
branch, $\beta$, is obtained by replacing each event of the form $!v$ in $\beta$ by the corresponding 
underlying event, $!(v^-)$. 

The partial access functions, $t @ \beta$ and $e @ \beta$, where $\beta$ is a simple branch, and $t$ and $e$ 
are variable-free (but not necessarily resource-free), are defined as follows. The definition 
may be justified by lexicographic induction on the structure of the branch followed by the 
size of the expression. 

\[
\begin{array}{rcl}
t @ \beta & = & e @ \beta \qquad (\text{where } t = \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}) \\
\texttt{return}(t) @ \bullet & = & \texttt{return}(t) \\
\texttt{let}\ !x{:}\eta\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} @ \beta \mathbin{\hat{}}\, !v & = & [v/x]e @ \beta \\
\texttt{let}\ a_1{:}\tau_1 \times a_2{:}\tau_2\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} @ \beta & = & e @ \beta \\
\texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} @ \beta \mathbin{\hat{}} \texttt{inl} & = & e_1 @ \beta \\
\texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} @ \beta \mathbin{\hat{}} \texttt{inr} & = & e_2 @ \beta
\end{array}
\]

This function will only be of interest in the case that $e @ \beta$ is a return expression, which, 
if well-typed, cannot contain free resources. Note that $(e @ \beta)^- = e^- @ \beta^-$, and similarly 
for values, $v$. 

We are now in a position to justify a subtlety in the second return rule of the dynamic 
semantics, which governs the case that the returned value has not already been stored in 
the memo table. This rule extends, rather than updates, the memo table with a binding for 
the branch that determines this return statement within the current memoized function. 
But why, after evaluation of $t$, is this branch undefined in the revised store, $\sigma'$? If the term 
$t$ were to introduce a binding for $\beta$ in the memo table $\sigma(l)$, it could only do so by evaluating 
the very same return statement, which implies that there is an infinite loop, contradicting 
the assumption that the return statement has a value, $v$. 

Lemma 2 

If $\sigma, t \Downarrow^{t} v, \sigma'$, $\sigma(l) @ \beta = \texttt{return}(t)$, and $\sigma(l)(\beta)$ is undefined, then $\sigma'(l)(\beta)$ is also undefined. 

An augmented branch, $\gamma$, is an extension of the notion of branch in which we record the 
bindings of resource variables. Specifically, the argument used to call a memoized function 
is recorded, as are the bindings of resources created by pair splitting and case analysis. 
Augmented branches are inductively defined by the following grammar: 

\[
\begin{array}{llcl}
\text{Augmented Event}  & \varepsilon & ::= & (v) \mid\ !v \mid \langle v_1, v_2 \rangle \mid \texttt{inl}(v) \mid \texttt{inr}(v) \\
\text{Augmented Branch} & \gamma      & ::= & \bullet \mid \varepsilon \cdot \gamma
\end{array}
\]
We write $\gamma \mathbin{\hat{}} \varepsilon$ for the extension of $\gamma$ with $\varepsilon$ at the end. There is an obvious simplification 
function, $\gamma^\circ$, that yields the simple branch corresponding to an augmented branch by 
dropping "call" events, $(v)$, and "pair" events, $\langle v_1, v_2 \rangle$, and by omitting the arguments to 
"injection" events, $\texttt{inl}(v)$, $\texttt{inr}(v)$. The underlying augmented branch, $\gamma^-$, corresponding 
to an augmented branch, $\gamma$, is defined by replacing each augmented event, $\varepsilon$, by its 
corresponding underlying augmented event, $\varepsilon^-$, which is defined in the obvious manner. Note 
that $(\gamma^\circ)^- = (\gamma^-)^\circ$. 
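The simplification function can be made concrete with a short Python sketch. The tuple encoding of events is our own assumption, chosen only for illustration: `("call", v)`, `("pair", v1, v2)`, `("bang", v)`, `("inl", v)`, and `("inr", v)`.

```python
def simplify(augmented_branch):
    """Map an augmented branch (a list of events) to its simple branch."""
    simple = []
    for event in augmented_branch:
        kind = event[0]
        if kind in ("call", "pair"):
            continue                  # call and pair events are dropped
        if kind == "bang":
            simple.append(event)      # bang events keep their value
        else:
            simple.append((kind,))    # injections keep only the tag
    return simple


augmented = [("inl", 5), ("bang", 3), ("pair", 1, 2), ("call", (1, 2))]
simple = simplify(augmented)
```

Only the events that influence which return statement is reached survive, which is why the simple branch suffices as a memo key.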

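The partial access functions $e @ \gamma$ and $t @ \gamma$ are defined for closed expressions $e$ and 
closed terms $t$ by the following equations: 

\[
\begin{array}{rcl}
t @ \gamma \mathbin{\hat{}} (v) & = & [t, v / f, a]e @ \gamma \qquad (\text{where } t = \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}) \\
e @ \bullet & = & e \\
\texttt{let}\ !x{:}\eta\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} @ \gamma \mathbin{\hat{}}\, !v & = & [v/x]e @ \gamma \\
\texttt{let}\ a_1{:}\tau_1 \times a_2{:}\tau_2\ \texttt{be}\ t\ \texttt{in}\ e\ \texttt{end} @ \gamma \mathbin{\hat{}} \langle v_1, v_2 \rangle & = & [v_1, v_2 / a_1, a_2]e @ \gamma \\
\texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} @ \gamma \mathbin{\hat{}} \texttt{inl}(v) & = & [v/a_1]e_1 @ \gamma \\
\texttt{mcase}\ t\ \texttt{of}\ \texttt{inl}(a_1{:}\tau_1) \Rightarrow e_1 \mid \texttt{inr}(a_2{:}\tau_2) \Rightarrow e_2\ \texttt{end} @ \gamma \mathbin{\hat{}} \texttt{inr}(v) & = & [v/a_2]e_2 @ \gamma
\end{array}
\]

Note that $(e @ \gamma)^- = e^- @ \gamma^-$, and similarly for values, $v$. 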

Augmented branches, and the associated access function, are needed for the proof of 
soundness. The proof maintains an augmented branch that enriches the current simple 
branch of the dynamic semantics. The additional information provided by augmented 
branches is required for the induction, but it does not affect any return statement it may 
determine. 

Lemma 3 

If $e @ \gamma = \texttt{return}(t)$, then $e @ \gamma^\circ = \texttt{return}(t)$. 

A function assignment, $\Sigma$, is a finite mapping from locations to well-formed, closed, 
pure function values. A function assignment is consistent with a term, $t$, or expression, 
$e$, if and only if whenever $\texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}$ occurs in either $t$ or $e$, then 
$\Sigma(l) = \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e^-\ \texttt{end}$. Note that if a term or expression is consistent with a function 
assignment, then no two function values with distinct underlying values may have the 
same label. A function assignment is consistent with a store, $\sigma$, if and only if whenever 
$\sigma(l)(\beta) = v$, then $\Sigma$ is consistent with $v$. 

A store, $\sigma$, tracks a function assignment, $\Sigma$, if and only if $\Sigma$ is consistent with $\sigma$, 
$\mathrm{dom}(\sigma) = \mathrm{dom}(\Sigma)$, and for every $l \in \mathrm{dom}(\sigma)$, if $\sigma(l)(\beta) = v$, then 

1. $\Sigma(l) @ \beta^- = \texttt{return}(t^-)$, 

2. $t^- \Downarrow^{t}_{p} v^-$. 
Thus if a branch is assigned a value by the memo table associated with a function, it can 
only do so if that branch determines a return statement whose value is the assigned value 
of that branch, relative to the non-memoizing semantics. 
We are now in a position to prove the soundness of MFL. 

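Theorem 4 

1. If $\sigma, t \Downarrow^{t} v, \sigma'$, $\Sigma$ is consistent with $t$, $\sigma$ tracks $\Sigma$, and $\emptyset; \emptyset \vdash t : \tau$, then $t^- \Downarrow^{t}_{p} v^-$ and there 
exists $\Sigma' \supseteq \Sigma$ such that $\Sigma'$ is consistent with $v$ and $\sigma'$ tracks $\Sigma'$. 

2. If $\sigma, l{:}\beta, e \Downarrow^{e} v, \sigma'$, $\Sigma$ is consistent with $e$, $\sigma$ tracks $\Sigma$, $\gamma^\circ = \beta$, $\Sigma(l) @ \gamma^- = e^-$, and 
$\emptyset; \emptyset \vdash e : \tau$, then there exists $\Sigma' \supseteq \Sigma$ such that $e^- \Downarrow^{e}_{p} v^-$, $\Sigma'$ is consistent with $v$, and 
$\sigma'$ tracks $\Sigma'$. 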

Proof: The proof proceeds by simultaneous induction on the memoized evaluation relation. 
We consider here the five most important cases of the proof: function values, function terms, 
function application terms, and return expressions. 

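For function values $t = \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}$, simply take $\Sigma' = \Sigma$ and note that 
$v = t$ and $\sigma' = \sigma$. 

For function terms $t = \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}$, note that $v = \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}$ 
and $\sigma' = \sigma[l \mapsto \emptyset]$, where $l \notin \mathrm{dom}(\sigma)$. Let $\Sigma' = \Sigma[l \mapsto v^-]$, and note that since $\sigma$ tracks 
$\Sigma$, and $\sigma'(l) = \emptyset$, it follows that $\sigma'$ tracks $\Sigma'$. Since $\Sigma$ is consistent with $t$, it follows by 
construction that $\Sigma'$ is consistent with $v$. Finally, since $v^- = t^-$, we have $t^- \Downarrow^{t}_{p} v^-$, as 
required. 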

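For application terms $t = t_1\, t_2$, we have by induction that $t_1^- \Downarrow^{t}_{p} v_1^-$ and there exists 
$\Sigma_1 \supseteq \Sigma$ consistent with $v_1$ such that $\sigma_1$ tracks $\Sigma_1$. Since $v_1 = \texttt{mfun}_l\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e\ \texttt{end}$, 
it follows from consistency that $\Sigma_1(l) = v_1^-$. Applying induction again, we obtain that 
$t_2^- \Downarrow^{t}_{p} v_2^-$, and there exists $\Sigma_2 \supseteq \Sigma_1$ consistent with $v_2$ such that $\sigma_2$ tracks $\Sigma_2$. It follows 
that $\Sigma_2$ is consistent with $[v_1, v_2 / f, a]e$. Let $\gamma = (v_2) \cdot \bullet$. Note that $\gamma^\circ = \bullet = \beta$ and we have 

\[
\begin{array}{rcl}
\Sigma_2(l) @ \gamma^- & = & v_1^- @ \gamma^- \\
                       & = & (v_1 @ \gamma)^- \\
                       & = & ([v_1, v_2 / f, a]e)^- \\
                       & = & [v_1^-, v_2^- / f, a]e^-.
\end{array}
\]

Therefore, by induction, $[v_1^-, v_2^- / f, a]e^- \Downarrow^{e}_{p} v'^-$, and there exists $\Sigma' \supseteq \Sigma_2$ consistent with 
$v'$ such that $\sigma'$ tracks $\Sigma'$. It follows that $(t_1\, t_2)^- = t_1^-\, t_2^- \Downarrow^{t}_{p} v'^-$, as required. 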

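For return statements, we have two cases to consider, according to whether the current 
branch is in the domain of the current memo table. Suppose that $\sigma, l{:}\beta, \texttt{return}(t) \Downarrow^{e} v, \sigma'$ 
with $\Sigma$ consistent with $\texttt{return}(t)$, $\sigma$ tracking $\Sigma$, $\gamma^\circ = \beta$, $\Sigma(l) @ \gamma^- = (\texttt{return}(t))^- = 
\texttt{return}(t^-)$, and $\emptyset; \emptyset \vdash \texttt{return}(t) : \tau$. Note that by Lemma 3, $\Sigma(l) @ (\gamma^-)^\circ = \Sigma(l) @ \beta^- = 
\texttt{return}(t^-)$. 

For the first case, suppose that $\sigma(l)(\beta) = v$. Since $\sigma$ tracks $\Sigma$ and $l \in \mathrm{dom}(\sigma)$, we have 
$\Sigma(l) = \texttt{mfun}\ f(a{:}\tau_1){:}\tau_2\ \texttt{is}\ e^-\ \texttt{end}$ with $e^- @ \beta^- = \texttt{return}(t^-)$, and $t^- \Downarrow^{t}_{p} v^-$. Note that 
$\sigma' = \sigma$, so taking $\Sigma' = \Sigma$ completes the proof. 

For the second case, suppose that $\sigma(l)(\beta)$ is undefined. By induction $t^- \Downarrow^{t}_{p} v^-$ and there 
exists $\Sigma' \supseteq \Sigma$ consistent with $v$ such that $\sigma'$ tracks $\Sigma'$. Let $\theta' = \sigma'(l)$, and note $\theta'(\beta)\uparrow$ by 
Lemma 2. Let $\theta'' = \theta'[\beta \mapsto v]$, and $\sigma'' = \sigma'[l \leftarrow \theta'']$. Let $\Sigma'' = \Sigma'$; we are to show that 
$\Sigma''$ is consistent with $v$, and $\sigma''$ tracks $\Sigma''$. By the choice of $\Sigma''$ it is enough to show that 
$\Sigma'(l) @ \beta^- = \texttt{return}(t^-)$, which we noted above. 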

■ 

The soundness theorem (Theorem 5) for MFL states that evaluation of a program (a 
closed term) with memoization yields the same outcome as evaluation without memoization. 
The theorem follows from Theorem 4. 

Theorem 5 (Soundness) 

If $\emptyset, t \Downarrow^{t} v, \sigma$, where $\emptyset; \emptyset \vdash t : \tau$, then $t^- \Downarrow^{t}_{p} v^-$. 

Type safety follows from the soundness theorem, since type safety holds for the 
non-memoized semantics. In particular, if a term or expression had a non-canonical value in the 
memoized semantics, then the same term or expression would have a non-canonical value 
in the non-memoized semantics, contradicting safety for the non-memoized semantics. 

4.5 Asymptotic complexity 

We show that memoization slows down an MFL program by a constant factor (expected) 
with respect to a standard, non-memoizing semantics even when no results are re-used. The 
result relies on representing a branch as a sequence of integers and using this sequence to 
key memo tables, which are represented with nested hash tables. 

To represent branches as integer sequences we use the property of MFL that the 
underlying type $\eta$ of a bang type, $!\eta$, is an indexable type. Since any value of an indexable type 
has an integer index, we represent a branch as a sequence of integers corresponding to the 
indices of let!-bound values, and zero or one for inl and inr. 

We represent memo tables as nested hash tables. A nested hash table is a tree of hash 
tables consisting of internal hash tables and external hash tables (leaves). Internal hash 
tables map an integer (an index) to another hash table. External hash tables map an integer 
to the result of the function. Given a branch /3 of length m (consisting of m indices), a 
lookup proceeds by indexing each key in order starting at the root of the nested hash table. 
Each lookup except for the last returns a hash table, which is then used for the next lookup 
with the next index. The last lookup returns the desired result in the case of a memo hit, or 
nothing in the case of a memo miss. Since a lookup takes expected constant time, a lookup 
with a branch of length m takes expected O(m) time. The same bound holds for update operations 
(insertions, deletions). 
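A nested table of this kind can be sketched in Python as follows. This is an illustration, not the paper's implementation: the class and method names are hypothetical, Python dictionaries stand in for hash tables, and each node carries its own value slot (a small deviation from the internal/external split described above) so that the empty branch can also be stored.

```python
class Node:
    """One level of the nested table: children keyed by index, plus a value slot."""

    def __init__(self):
        self.children = {}      # index -> Node (the next-level table)
        self.has_value = False
        self.value = None


class NestedTable:
    def __init__(self):
        self.root = Node()

    def lookup(self, branch):
        """Return (True, value) on a memo hit and (False, None) on a miss."""
        node = self.root
        for index in branch:                 # one dictionary probe per index
            node = node.children.get(index)
            if node is None:
                return (False, None)
        if node.has_value:
            return (True, node.value)
        return (False, None)

    def insert(self, branch, value):
        """Store value under branch, creating intermediate tables as needed."""
        node = self.root
        for index in branch:
            child = node.children.get(index)
            if child is None:
                child = Node()
                node.children[index] = child
            node = child
        node.has_value = True
        node.value = value


table = NestedTable()
table.insert([3, 0, 7], "result")
hit = table.lookup([3, 0, 7])
miss = table.lookup([3, 1, 7])
```

Both operations perform one dictionary probe per index, so a branch of length m costs a number of probes proportional to m.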

Theorem 6 

The overhead of an MFL program with respect to a pure, non-memoizing semantics is 
expected O(1), where the expectation is over internal randomization used for hash tables. 
Proof: Consider a non-memoizing semantics, where the return rule always evaluates its 
body and neither looks up nor updates memo tables (stores). Consider an MFL program 
and let T denote the time (the number of evaluation steps) it takes to evaluate the program 
with respect to this non-memoizing semantics. Let T' denote the time it takes to evaluate 
the same program with respect to the memoizing semantics. In the worst case, no results 
are re-used, thus the difference between T and T' is due to memo-table operations (lookups 
and updates) performed by the memoizing semantics. 

To bound the time for memo-table operations, consider a memo-table operation with 
a branch $\beta$ and let m be the length of the branch. With nested hash tables, the 
operation requires expected $\Theta(m)$ time. Since the non-memoizing semantics takes $\Theta(m)$ time 
to build the branch, the overhead of the memo-table operations is expected O(1). Since a 
branch is used to perform only a constant number of memo-table operations (one lookup and 
one update) we conclude that the overhead of selective memoization is O(1) in expectation. ■ 
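
The arithmetic behind this argument can be spelled out as a sketch under assumptions of our own, which go slightly beyond what the proof states. Suppose the memoizing run evaluates $k$ return expressions with branches of lengths $m_1, \ldots, m_k$, that the (at most two) memo-table operations for the $i$-th return cost at most $2c\,(m_i + 1)$ expected steps for some constant $c$, and that the events of distinct branches correspond to distinct evaluation steps of the non-memoizing run, so that $\sum_i m_i \le T$ and $k \le T$. Then

\[
\mathbb{E}[T'] \;\le\; T + \sum_{i=1}^{k} 2c\,(m_i + 1) \;=\; T + 2c \sum_{i=1}^{k} m_i + 2c\,k \;\le\; T + 2c\,T + 2c\,T \;=\; (1 + 4c)\,T ,
\]

which is a constant-factor slowdown in expectation.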



5 Implementation 

We describe an implementation of the MFL language as a Standard ML library. Since the 
library cannot differentiate between resources and variables syntactically, it uses a separate 
type for resources. The library therefore cannot enforce statically the aspects of MFL that 
rely on the syntactic distinction between resources and variables; instead it employs run-time checks to 
detect violations of correct usage. ^ 

The interface for the library (Figure 9) provides types for expressions, resources, bangs, 
products, sums, and memoized functions, along with their introduction and elimination forms. 
All expressions have type 'a expr, which is a monad with return as the inclusion and 
various forms of bind operations as elimination forms: letBang, letx, and mcase. A resource 
has type 'a res. The library provides no explicit introduction form for resources. Instead, 
resources are created by the letx, mcase, mfun_rec, and mfun primitives. The elimination form 
for resources is expose, which returns the underlying value of a resource. 

The introduction and elimination forms for bang types are bang and letBang. The 
introduction form for product types is pair, and the elimination forms are letx and split. 
The letx is a bind operation for the monad expr; split is the elimination form for the term 
context. The treatment of sums is similar to product types. The introduction forms are 
inl and inr, and the elimination forms are mcase and choose; mcase is a bind operation 
for the expr monad and choose is the elimination form for the term context. 

The mfun and mfun_rec primitives introduce memoized functions. The mfun primitive 
takes a function of type 'a res -> 'b expr and returns the memoized function of type 
('a, 'b) marrow; mfun_rec is similar to mfun but it also takes its memoized version as 

^We describe elsewhere a library for Standard ML that can in fact enforce the MFL type system stati- 
cally [Acar et al., 2006a]. The approach, however, does not scale well. 
signature MEMO = 
sig 
  (* Expressions *) 
  type 'a expr 
  val return: (unit -> 'a) -> 'a expr 

  (* Resources *) 
  type 'a res 
  val expose: 'a res -> 'a 

  (* Bangs *) 
  type 'a bang 
  val bang: ('a -> int) -> 'a -> 'a bang 
  val letBang: ('a bang) -> ('a -> 'b expr) -> 'b expr 

  (* Products *) 
  type ('a,'b) prod 
  val pair: 'a -> 'b -> ('a,'b) prod 
  val letx: ('a,'b) prod -> (('a res * 'b res) -> 'c expr) -> 'c expr 
  val split: ('a,'b) prod -> (('a * 'b) -> 'c) -> 'c 

  (* Sums *) 
  type ('a,'b) sum 
  val inl: 'a -> ('a,'b) sum 
  val inr: 'b -> ('a,'b) sum 
  val mcase: ('a,'b) sum -> ('a res -> 'c expr) -> ('b res -> 'c expr) -> 'c expr 
  val choose: ('a,'b) sum -> ('a -> 'c) -> ('b -> 'c) -> 'c 

  (* Memoized arrow *) 
  type ('a,'b) marrow 
  val mfun: ('a res -> 'b expr) -> ('a,'b) marrow 
  val mfun_rec: (('a,'b) marrow -> 'a res -> 'b expr) -> ('a,'b) marrow 
  val mapply: ('a,'b) marrow -> 'a -> 'b 
end 

signature BOX = 
sig 
  type 'a box 
  val box: 'a -> 'a box 
  val unbox: 'a box -> 'a 
  val keyOf: 'a box -> int 
end 



Figure 9: The signatures for the memo library and boxes. 

argument. Note that the result type does not contain the "effect" expr: the library 
encapsulates memoization effects, which are benign, within the function. The elimination form for 
the marrow is the memoized apply function mapply. 

In addition to primitives for memoization, the library provides facilities for boxing 
and unboxing of values. As described in Section 3, boxes enable injecting ordinary types 
into indexable types. Figure 9 shows the signature for boxes. 

The library implements memo tables as nested hash tables as described in Section 4.5. 
Figure 10 shows the interface for the memo tables. The empty function returns an empty 
signature MEMO_TABLE = 
sig 
  type 'a memotable 
  val empty: unit -> 'a memotable 
  val extend: 'a memotable -> int list -> ('a option * 'a memotable option) 
  val insert: 'a -> 'a memotable -> unit 
end 



Figure 10: The signatures for memo tables. 

memo table. The extend function performs a look up with the provided int (index) list and 
returns a pair consisting of the result (if found) and the extended memo table. The insert 
function inserts the provided result into the specified memo table. 

Figure 11 shows an implementation of the library. The bang primitive takes a value and 
an injective function, called the index function, that maps the value to an integer, called 
the index. The index of a value is used to key memo tables. The restriction that the indices 
be unique enables implementing memo tables using hashing. The primitive letBang takes 
a value b of bang type and a body. It applies the body to the underlying value of b, and 
extends the branch with the index of b. The function letx takes a pair p and a body. It 
binds the parts of the pair to two resources and applies the body to the resources; as 
with the operational semantics, letx does not extend the branch. The function mcase takes 
a value s of sum type and a body. It branches on the outer form of s and binds its inner 
value to a resource. It then applies the body to the resource and extends the branch with 
0 or 1 depending on the outer form of s. The elimination forms of sums and products for 
the term context, split and choose, are standard. 

The return primitive finalizes the branch and returns its body as a suspension. The 
branch is used by mfun_rec or mfun, to key the memo table. If the result is found in 
the memo table, then the suspension is disregarded and the result is re-used; otherwise 
the suspension is forced and the result is stored in the memo table keyed by the branch. 
The mfun_rec primitive takes a recursive function f as a parameter and "memoizes" f by 
associating it with a memo table. A subtle issue is that f must call its memoized version 
recursively. Therefore f must take its memoized version as a parameter. Note also that the 
memoized function internally converts its parameter to a resource before applying f to it. 
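
The behavior of these primitives can be sketched in Python. This is a loose transliteration under our own assumptions, not the library itself: the names (`return_`, `let_bang`, `mfun_rec`, `int_bang`) are hypothetical, resources are represented by the values themselves, sums are encoded as `("inl", v)` or `("inr", v)` tuples, and a flat dictionary keyed by the whole branch replaces the nested memo table.

```python
def return_(thunk):
    """Finalize the branch: an empty branch plus the suspended result."""
    return ([], thunk)


def bang(index_fn, value):
    """Pair a value with its index function."""
    return (value, index_fn)


def let_bang(b, body):
    """Apply body to the underlying value and extend the branch with its index."""
    value, index_fn = b
    branch, susp = body(value)
    return ([index_fn(value)] + branch, susp)


def letx(pair, body):
    """Split a pair; no event is added to the branch."""
    return body(pair)


def mcase(s, on_left, on_right):
    """Branch on a sum and record which case was taken (0 for inl, 1 for inr)."""
    tag, value = s
    if tag == "inl":
        branch, susp = on_left(value)
        return ([0] + branch, susp)
    branch, susp = on_right(value)
    return ([1] + branch, susp)


def mfun_rec(f):
    """Memoize f, which receives its own memoized version as first argument."""
    table = {}

    def memoized(x):
        branch, susp = f(memoized, x)
        key = tuple(branch)
        if key not in table:          # memo miss: force the suspension
            table[key] = susp()
        return table[key]

    return memoized


def int_bang(n):
    """Integers are their own index."""
    return bang(lambda i: i, n)


def mfib_body(rec, n_bang):
    return let_bang(n_bang, lambda n: return_(
        lambda: n if n < 2 else rec(int_bang(n - 1)) + rec(int_bang(n - 2))))


mfib = mfun_rec(mfib_body)
fib30 = mfib(int_bang(30))
```

Because every recursive call goes through `memoized`, each Fibonacci argument forces its suspension at most once, so the sketch runs in time linear in its argument.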

The implementation described here does not check for correct usage. To incorporate 
the run-time checks for correct usage, we need a more sophisticated definition of resources 
in order to detect when a resource is exposed out of its context (i.e., function instance). 
In addition, the interface must be updated so that the first parameter of letBang, letx, 
and mcase occurs in suspended form. This enables updating the state consisting of certain 
flags before forcing a term. 

Figure 12 shows the examples from Section 3 written in the SML library. The examples 
assume a Box structure that ascribes to the BOX signature (Figure 9). The hCons function 
follows the description closely. The Fibonacci function mfib applies its argument to a fresh 
functor BuildMemo (structure MemoTable: MEMO_TABLE): MEMO = 
struct 
  (* An expression is a branch (list of indices) and a suspended result. *) 
  type 'a expr = int list * (unit -> 'a) 
  fun return f = (nil, f) 

  (* Resources are represented by the values themselves. *) 
  type 'a res = 'a 
  fun res v = v 
  fun expose r = r 

  (* A bang value carries its index function. *) 
  type 'a bang = 'a * ('a -> int) 
  fun bang h t = (t, h) 
  fun letBang b f = 
    let val (v, h) = b 
        val (branch, susp) = f v 
    in ((h v) :: branch, susp) end 

  type ('a,'b) prod = 'a * 'b 
  fun pair x y = (x, y) 
  fun split p f = f p 
  fun letx (p as (x1, x2)) f = f (res x1, res x2) 

  datatype ('a,'b) sum = INL of 'a | INR of 'b 
  fun inl v = INL(v) 
  fun inr v = INR(v) 
  fun mcase s f g = 
    let val (lr, (branch, susp)) = case s of 
                                     INL v => (0, f (res v)) 
                                   | INR v => (1, g (res v)) 
    in 
      (lr :: branch, susp) 
    end 
  fun choose s f g = case s of INL v => f v | INR v => g v 

  type ('a,'b) marrow = 'a -> 'b 
  fun mfun_rec f = 
    let val mtable = MemoTable.empty ()   (* empty takes a unit argument *) 
        fun mf rf x = 
          let val (branch, susp) = f rf (res x) 
              val result = case MemoTable.extend mtable branch of 
                             (NONE, SOME mtable') => (* Not found *) 
                               let val v = susp () 
                                   val _ = MemoTable.insert v mtable' 
                               in v end 
                           | (SOME v, NONE) => v (* Found *) 
          in result end 
        fun mf' x = mf mf' x 
    in 
      mf' 
    end 

  fun mfun f = ... (* Similar to mfun_rec *) 
  fun mapply f v = f v 
end 



Figure 11: The implementation of the memoization library. 
struct 
  type 'a box = 'a Box.box 
  fun iB v = bang (fn i => i) v 
  fun bB b = bang (fn b => Box.keyOf b) b 

  (** Boxed lists **) 
  datatype 'a blist' = NIL | CONS of ('a * (('a blist') box)) 
  type 'a blist = ('a blist') box 

  (** Hash-cons **) 
  fun hCons' (x') = letx (expose x') (fn (h',t') => 
    letBang (expose h') (fn h => letBang (expose t') (fn t => 
      return (fn () => box (CONS(h,t)))))) 
  fun hCons x = mapply (mfun hCons') x 

  (** Fibonacci **) 
  fun mfib' f (n') = 
    letBang (expose n') (fn n => 
      return (fn () => if n < 2 then n 
                       else (mapply f (iB(n-1))) + (mapply f (iB(n-2))))) 
  fun mfib n = mapply (mfun_rec mfib') n 

  (** Knapsack **) 
  fun mks' mks (arg) = 
    letx (expose arg) (fn (c',l') => 
      letBang (expose c') (fn c => 
        letBang (expose l') (fn l => return (fn () => 
          case (unbox l) of 
            NIL => 0 
          | CONS((w,v),t) => if (c < w) then mapply mks (pair (iB c) (bB t)) 
                             else let val v1 = mapply mks (pair (iB c) (bB t)) 
                                      val v2 = v + mapply mks (pair (iB (c-w)) (bB t)) 
                                  in if (v1 > v2) then v1 else v2 end)))) 
  fun mks x = mapply (mfun_rec mks') x 

  (** Quicksort **) 
  fun mqs () = 
    let val empty = box NIL 
        val hCons = mfun hCons' 
        fun fil f l = 
          case (unbox l) of 
            NIL => empty 
          | CONS(h,t) => if (f h) then (mapply hCons (pair (iB h) (bB (fil f t)))) 
                         else (fil f t) 
        fun qs' qs (l') = letBang (expose l') (fn l => return (fn () => 
          case (unbox l) of 
            NIL => nil 
          | CONS(h,t) => let val ll = fil (fn x => x < h) t 
                             val gg = fil (fn x => x >= h) t 
                             val sll = mapply qs (bB ll) 
                             val sgg = mapply qs (bB gg) 
                         in sll @ (h :: sgg) end)) 
    in mfun_rec qs' end 
end 





Figure 12: Examples from Section 3 in the SML library. 
memoized instance of the Fibonacci function (mfib'). As a result, mfib allocates a memo
table every time it is called. Since this table is shared by all calls to mfib', mfib runs 
in linear time. When mfib finishes, this table can be garbage collected. As with mfib, 
the Knapsack function mks also applies its argument to a fresh memoized instance of the 
memoized Knapsack function (mks'). Therefore, when mks returns, the allocated memo 
table can be garbage collected. For Quicksort, we provide a function mqs that returns an 
instance of memoized Quicksort when applied. Each such instance has its own memo table. 
Note also that mqs creates a local instance of the hash-cons function so that each instance 
of memoized Quicksort has its own memo table for hash-consing — this table can be garbage 
collected when mqs returns. 
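
The scoping behavior described above can be illustrated without the library. The following is a minimal sketch in plain Standard ML, not the library's implementation: memoRec, newTable, and lookup are hypothetical stand-ins for mfun_rec and its memo table, the table is a simple association list keyed on integers, and the counter runs exists only to observe how often the function body executes.

```sml
(* Hypothetical memo table: an association list held in a ref cell. *)
fun newTable () : (int * int) list ref = ref []

fun lookup (tbl : (int * int) list ref) (k : int) : int option =
  case List.find (fn (k', _) => k' = k) (!tbl) of
    SOME (_, v) => SOME v
  | NONE => NONE

(* Stand-in for mfun_rec: each application of memoRec to f allocates
   its own table, shared by all recursive calls made through mf. *)
fun memoRec (f : (int -> int) -> int -> int) : int -> int =
  let
    val tbl = newTable ()
    fun mf n =
      case lookup tbl n of
        SOME v => v                          (* Found *)
      | NONE => let val v = f mf n           (* Not found *)
                in tbl := (n, v) :: !tbl; v end
  in mf end

(* Counts executions of the function body (illustration only). *)
val runs = ref 0

fun fib' f n =
  (runs := !runs + 1;
   if n < 2 then n else f (n - 1) + f (n - 2))

(* A fresh table is allocated on every call to mfib and becomes
   unreachable, hence collectable, once the call returns. *)
fun mfib n = memoRec fib' n

val r = mfib 30
val bodyRuns = !runs   (* one body execution per distinct argument 0..30 *)
```

Binding the memoized instance once, as in `val fib = memoRec fib'`, would instead keep a single table alive for as long as fib itself is reachable, which is the trade-off that mqs exposes by returning an instance.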

In the examples, we do not use the sum types provided by the library to represent boxed 
lists, because ML sum types suffice for the considered examples. In general, one will use the 
provided sum types instead of their ML counterparts (for example, if an mcase is required). 
The examples in Figure 12 can be implemented using the following definition of boxed lists. 

datatype 'a boxlist' = ROLL of (unit, (('a, 'a boxlist' box) prod)) sum 
type 'a boxlist = ('a boxlist') box 

Changing the code in Figure 12 to work with this definition of boxed lists requires several 
straightforward modifications. 
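
To give a feel for these modifications, the sketch below shows how the NIL and CONS cases map onto the sum-and-product encoding. It is self-contained and makes several assumptions that are not taken from the library: sum, prod, box, unbox, INL, and INR are hypothetical stand-ins (the library's own box, sum, and prod types may be represented differently), and fromList and toList are helper names introduced only for this illustration.

```sml
(* Hypothetical stand-ins for the library's types (assumptions). *)
datatype ('a, 'b) sum = INL of 'a | INR of 'b
type ('a, 'b) prod = 'a * 'b
datatype 'a box = BOX of int * 'a        (* a unique key plus the value *)

local
  val next = ref 0
in
  fun box v = (next := !next + 1; BOX (!next, v))
end

fun unbox (BOX (_, v)) = v

(* The definition of boxed lists from the text. *)
datatype 'a boxlist' = ROLL of (unit, ('a, 'a boxlist' box) prod) sum
type 'a boxlist = 'a boxlist' box

(* NIL becomes ROLL (INL ()); CONS (h, t) becomes ROLL (INR (h, t)). *)
fun fromList ([] : 'a list) : 'a boxlist = box (ROLL (INL ()))
  | fromList (x :: xs) = box (ROLL (INR (x, fromList xs)))

(* A case on NIL / CONS turns into a case on the unrolled sum. *)
fun toList (b : 'a boxlist) : 'a list =
  case unbox b of
    ROLL (INL ()) => []
  | ROLL (INR (x, t)) => x :: toList t

val roundTrip = toList (fromList [3, 1, 2])
```

In Figure 12 the corresponding change would be to replace each pattern on NIL or CONS with a pattern on the unrolled sum, or with the library's own case construct where memoized branching is needed.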

6 Discussion 

Space and cache management. Our framework associates a separate memo table with 
each memoized function. This allows the programmer to control the life-span of memo 
tables by conventional scoping. This somewhat coarse degree of control is sufficient in 
certain applications such as dynamic programming, but a finer level of control may be 
desirable for applications where result re-use is less regular. Such an application can benefit 
from specifying a caching scheme for individual memo tables so as to determine the size 
of the memo table and the replacement policy. We discuss how the framework can be 
extended to associate a cache scheme with each memo table and maintain the memo table 
accordingly. 

The caching scheme should be specified in the form of a parameter to the mfun construct. 
When evaluated, this construct will bind the caching scheme to the memo table and 
the memo table will be maintained accordingly. The changes to the operational semantics 
needed to accommodate this extension are small. The store $\sigma$ will now map a label to a pair 
consisting of a memo table and its caching scheme. The handling of return will be changed so 
that the stores do not merely expand but are updated according to the caching scheme 
before adding a new entry. The following shows the updated return rule. Here $S$ denotes 
a caching scheme and $\theta$ denotes a memo table. The update function 
updates the memo table to accommodate a new entry, possibly purging an existing 
entry. The programmer must ensure that the caching scheme does not violate the integrity 
of the memo table by tampering with stored values. 



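\[
\frac{\sigma(l) = (\theta, S) \qquad \theta(\beta) = v}
     {\sigma,\; l{:}\beta,\; \mathtt{return}(t) \;\Downarrow^{e}\; v,\; \sigma}
\quad \text{(Found)}
\]

\[
\frac{\begin{array}{c}
        \sigma(l) = (\theta, S) \qquad \theta(\beta)\!\uparrow \qquad \sigma,\; t \;\Downarrow^{t}\; v,\; \sigma' \\[2pt]
        \sigma'(l) = (\theta', S) \qquad \theta'' = \mathit{update}(\theta', S, (\beta, v))
      \end{array}}
     {\sigma,\; l{:}\beta,\; \mathtt{return}(t) \;\Downarrow^{e}\; v,\; \sigma'[l \leftarrow \theta'']}
\quad \text{(Not Found)}
\]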



For example, we can specify that the memo table for the Fibonacci function, shown in 
Figure 1, can contain at most two entries and be managed using the least-recently-used 
replacement policy. This is sufficient to ensure that the memoized Fibonacci runs in linear 
time. This extension can also be incorporated into the type system described in Section 4. 
This would require that we associate types with memo stores and also require that we 
develop a type system for "safe" update functions if we wish to enforce that the caching 
schemes are safe. 
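
As an illustration of the update function in the rule above, the following is a small sketch in plain Standard ML of a memo table bounded by a least-recently-used policy. It is not part of the library: the names table, scheme, lookup, and memoRecBounded are hypothetical, the table is an association list ordered from most to least recently used, and the scheme is reduced to a single capacity. Only update and its argument order (table, scheme, new entry) follow the rule. In this particular model a hit also refreshes recency, so the order of the two recursive calls determines which entry is evicted; the sketch evaluates f (n - 2) before f (n - 1), which keeps both needed entries resident.

```sml
type table = (int * int) list          (* most recently used entry first *)
type scheme = {capacity : int}         (* hypothetical caching scheme S *)

(* On a hit, return the value and the table with the entry moved to the front. *)
fun lookup (tbl : table) (k : int) : (int * table) option =
  case List.partition (fn (k', _) => k' = k) tbl of
    ((_, v) :: _, rest) => SOME (v, (k, v) :: rest)
  | ([], _) => NONE

(* update (theta, S, (beta, v)): add the new entry, purging the least
   recently used entries if the table would exceed its capacity. *)
fun update (tbl : table, {capacity} : scheme, (k, v)) : table =
  List.take ((k, v) :: tbl, Int.min (capacity, length tbl + 1))

(* Stand-in for a memoizing construct that takes a caching scheme. *)
fun memoRecBounded (s : scheme) (f : (int -> int) -> int -> int) : int -> int =
  let
    val store = ref ([] : table)
    fun mf n =
      case lookup (!store) n of
        SOME (v, tbl') => (store := tbl'; v)               (* Found *)
      | NONE =>
          let val v = f mf n                               (* Not found *)
          in store := update (!store, s, (n, v)); v end
  in mf end

(* Counts executions of the function body (illustration only). *)
val runs = ref 0

fun fib' f n =
  (runs := !runs + 1;
   if n < 2 then n
   else let val a = f (n - 2)          (* smaller argument first *)
            val b = f (n - 1)
        in a + b end)

val r = memoRecBounded {capacity = 2} fib' 30
val bodyRuns = !runs
```

With a capacity of two, each argument from 0 to 30 is computed exactly once in this sketch, so the number of body executions grows linearly with the argument.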

Local versus non-local dependences. Our dependence tracking mechanism only 
captures "local" dependences between the input and the result of a function. A local 
dependence of a function f is one that is created inside the static scope of f. A non-local 
dependence of f is created when f passes its input to some other function g, which 
examines f's input indirectly. In previous work, Abadi et al. [Abadi et al., 1996] and 
Heydon et al. [Heydon et al., 2000] showed program analysis techniques for tracking non-local 
dependences by propagating dependences of a function to its caller. They do not 
discuss, however, the efficiency implications of tracking non-local dependences. 

Our framework can be extended to track non-local dependences by introducing an 
application form for memoized functions in the expression context. This extension would, for 
example, allow for dependences of non-constant length. We chose not to support non-local 
dependences because it is not clear whether their utility outweighs their cost in efficiency. 

7 Conclusion 

We present language techniques for applying memoization selectively under programmer 
control. The approach makes explicit the performance effects of memoization and yields 
programs whose running times can be analyzed using standard techniques. A key aspect of 
the framework is that it can capture both control and data dependences between input and 
the result of a memoized function. We show that the approach admits a relatively simple 
implementation by giving an implementation as a library for the Standard ML language. 
The main contributions of the paper are the particular set of primitives we suggest and the 
semantics, along with the proof that it is sound. We expect that the techniques can be 
implemented in any purely-functional language. 